Advertisement Based on Speech Recognition

ABSTRACT

This invention relates to a system and method for providing dedicated television advertisements based on speech recognition of telephone conversations. The home user makes telephone calls using a landline, cellular phone or VoIP phone. The user may also be watching TV. While doing so, the user is able to see advertisements on part or all of his TV screen, based on words and phrases he may have used during his telephone conversation(s). The system and method may be provided by a triple play or quad play service provider which associates its users' telephone calls with TV advertising. Once the user sees the advertisement he can get more information concerning the advertisement by clicking on it with a pointing device, e.g., a mouse or TV remote control, or by using any other navigation method the TV system provides, to see additional details concerning the advertisements.

This application claims priority to U.S. Provisional Application Ser. No. 60/865,171, filed Nov. 10, 2006, the entirety of which is incorporated herein by reference.

FIELD OF INVENTION

This invention relates to a system and method for providing directed television advertisements based on speech recognition of telephone conversations.

BACKGROUND

Internet advertising, like other types of advertisement, is a method by which advertisers increase the awareness and sales of their products, goods, and ideas. Internet site owners, for example, may host advertisements on their Internet sites to generate revenue from advertisers, which can in turn finance other activities. Internet users exposed to advertisements may become potential buyers of the advertised products or ideas. The Internet can be a doorway to generating awareness of a product all over the world.

Text-messaging voting (SMS) is becoming popular. For example, users watching a TV show can vote in response to polling by the TV show by sending text messages using a cellular phone. Alternatively, users may place telephone calls to an Interactive Voice Response (IVR) system and, by pressing numbers on the telephone keypad, which generate dual-tone multi-frequency (DTMF) tones, can participate in the TV show by voting, playing, or changing some scenarios of the TV show.

SUMMARY OF INVENTION

The present invention creates new and useful advertising techniques for advertisers. In one embodiment, while a user talks on his landline phone or cellular phone, the service provider using the present invention recognizes some of the words and phrases the user says. The system may use speech recognition technology to display relevant advertisements on the user's TV screen, based on the recognized words used in the call.

The present invention may also provide the capability for “triple play” and “quad play” service providers to increase revenues through targeted advertising. This may allow them to provide reduced service rates for their subscribers (end users), or even to provide some of the services free of charge.

In one embodiment, speech recognition activity is only performed with the user's approval, given prior to any such call or system use. The goal is advertising activity directed to the users only; it is not meant to harm any user's privacy or to expose call information to third parties. In addition to advertisements, the present invention may display other relevant information related to the words used in the voice call.

Other objects, features, and advantages of one or more embodiments of the present invention will be apparent from the following detailed description, the accompanying drawings, and the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will now be disclosed, by way of example only, with reference to the accompanying schematic drawings in which corresponding reference symbols indicate corresponding parts, in which:

FIG. 1 shows a block diagram of a system architecture, in accordance with an embodiment of the present invention;

FIG. 2 shows a speech recognition system and how it interacts with other elements in the system, in accordance with an embodiment of the present invention; and

FIG. 3 shows a flowchart for generating television advertisements based on a telephone conversation, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION OF INVENTION

FIG. 1 shows a block diagram of a system architecture, in accordance with an embodiment of the present invention. Triple play service providers provide three services to their subscribers (end users): television, telephony and Internet. Quad play service providers further add cellular telephony services and/or other services. It will be appreciated that while a triple/quad play service provider may be a single entity (or appear to the subscriber to be a single entity), in actuality the various services may be provided by multiple service providers who are affiliated, or even bundled together by a third party. In one embodiment, the triple/quad play service provider is a broadband service provider.

The concept of a triple/quad play service subscriber is depicted generally by element 50. The subscriber's premises 50 includes, for example, a TV set 18 connected to a Set Top Box (STB) 19 and a regular telephone 21 connected to a gateway and cable modem 20. The TV set 18 may include any monitor or display device capable of displaying a television or video signal (e.g., LCD, DLP, plasma, CRT, etc.). The STB 19 may decode and/or decrypt the video feed signal from the service provider for input to the TV set 18. The subscriber's premises 50 may also include Internet services to a computer (not shown) and/or cellular service (not shown) as part of the triple/quad play service package.

The subscriber's premises 50 is connected to the triple/quad play service provider via a fiber-optic or coaxial network 13. In other embodiments, other networking technology may be employed, such as fixed-line, satellite, wireless, cellular, etc. Television may be fed from a cable TV conditional access system (CAS) 10 via the Fiber/Coax network 13 to the subscriber's STB 19 and TV set 18. In one embodiment, CAS 10 may be an Internet Protocol Television (IPTV) service provider.

Telephony service is provided via the Fiber/Coax network 13 to the regular telephone 21 connected to a gateway and cable modem 20. In an embodiment, the subscriber (user) initiates a regular phone call using his phone 21 to a remote telephone 12 connected to the public switched telephone network (PSTN) 8. While a call between two persons is generally envisioned, it will be appreciated that any number of persons, each using a different phone, may participate (e.g., call-waiting, party-lines, multi-party calls, teleconferencing, or the like). The call participants may use the same service provider, but they do not have to.

The telephone conversation may be initiated at the subscriber's premises 50 via Voice over Internet Protocol (VoIP). VoIP allows telephone-like voice conversation to be routed over the Internet and/or any other IP-based network. VoIP data packets may be transferred through a data switch 11 to the PSTN 8 via a gateway 9. The phone 21 may be an IP phone, VoIP phone, IP-phone client application (or “softphone”), or any application for making calls over an Internet or IP-based network. VoIP data packets may be transmitted, for example, using the Real-time Transport Protocol (RTP). In one embodiment, the softphone runs on an Analog Terminal Adaptor (ATA) provided with an analog phone.

In other embodiments, the phone 21 may use Plain Old Telephone Service (POTS). For example, the telephone conversation may be initiated at the subscriber's premises 50 via the PSTN, and converted to VoIP by the service provider. In other embodiments, the phone may be a cellular phone connected to the service provider through a cellular network. The cellular phone may be configured to make/receive VoIP calls or may be made VoIP-enabled.

Once the call is initiated, a VoIP softswitch 6 may send call detail records (CDR) to a billing system 7. The CDR may contain detailed information relating to a single call or session passing through the softswitch 6. Accounting software in the billing system 7 processes the CDRs, and produces bills for subscribers.

The billing system 7 enables the call, and sends the caller information (e.g., the subscriber's identifier and IP address) to a speech recognition system 5. The data switch 11 mirrors the data packets sent through it and transmits them to the speech recognition system 5. During the conversation, the call is detected and monitored by the speech recognition system 5.

In one embodiment, the voice recognition activity is performed only with the user's approval, given prior to any such call or system use. The present invention is not meant to harm any user's privacy, or to expose call information to third parties without the user's consent.

The speech recognition system 5 is configured to recognize key words and phrases from the conversation speech media, and perform a database lookup to get advertisements associated with them. In one embodiment, the speech recognition system 5 provides relevancy scores for the words and phrases identified in the conversations and data items found in the database. For example, the speech recognition system 5 may return the N highest-scored items, which may be sent to the CAS 10 to be combined in the subscriber's video feed.
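By way of non-limiting illustration, the scoring and selection of the N highest-scored items might be sketched as follows. The disclosure does not specify a scoring formula; the keyword-overlap fraction used here, and the names AdItem, relevancy_score and top_n_ads, are assumptions made for this sketch only.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class AdItem:
    """Hypothetical database item: an advertisement and its index keywords."""
    ad_id: str
    keywords: frozenset


def relevancy_score(recognized, ad):
    """Assumed metric: fraction of the ad's index keywords heard in the call."""
    if not ad.keywords:
        return 0.0
    return len(recognized & ad.keywords) / len(ad.keywords)


def top_n_ads(recognized_words, ads, n):
    """Return the ids of the n highest-scored ads, dropping ads with no match."""
    recognized = {w.lower() for w in recognized_words}
    scored = [(relevancy_score(recognized, ad), ad) for ad in ads]
    scored = [(score, ad) for score, ad in scored if score > 0]
    best = heapq.nlargest(n, scored, key=lambda pair: pair[0])
    return [ad.ad_id for _, ad in best]


ADS = [
    AdItem("toy-store", frozenset({"birthday", "party", "gift"})),
    AdItem("plasma-tv", frozenset({"buying", "new", "television"})),
    AdItem("bakery", frozenset({"birthday", "cake"})),
]
```

For example, calling top_n_ads(["Birthday", "party", "gift"], ADS, 2) would rank the toy store item ahead of the bakery item and omit the television item, which shares no keywords with the call.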

The speech recognition system 5 also sends the identification of the subscriber with which the text items are associated. The CAS 10 generates a video feed to the subscriber's premises 50 via the Fiber/Coax network 13. For example, the CAS 10 receives video content from various remote content providers via a satellite dish 3. In addition to the satellite dish 3, the CAS 10 may receive content through Playouts 2, which receives content from local or remote broadcasters, e.g., through fixed land lines, fiber-optics, or microwave transmissions, etc. The CAS 10 may also have access to a Video on Demand (VoD) server 1 storing video content, which can be accessed remotely by subscribers through a VoD service provided by the triple/quad play service provider.

The VoD server 1 streams video content (e.g., TV shows, movies, sporting events, etc.) in response to users interactively requesting such content. Users may interactively select content with a pointing device, e.g., a remote control, mouse, stylus, or by using any other navigation means via VoD 1. This logic path is depicted by reverse path 14.

Video feeds from the VoD service 1, the Playouts 2, and the satellite dish 3 may be combined along with advertisements from the speech recognition and advertisements search system 5 to form a single video feed for a particular subscriber. Modulators 4 may adjust the frequencies of the video feed, as necessary. The combined video feed is then transmitted via the cable Fiber/Coax network 13 to the subscriber's STB 19 and TV set 18.

FIG. 2 shows a speech recognition system and how it interacts with other elements in the system, in accordance with an embodiment of the present invention. When a call is initiated, the VoIP softswitch 6 sends call detail records (CDR) to a billing system 7. The CDR may then be sent by the billing system 7 to the data manager 28. The CDR may include the subscriber's identifier and the IP address of the telephone generating the call. The subscriber's identifier may be any unique number, code, identifier, or the like, for identifying a particular subscriber.

The data manager 28 provides a media handler 22 with the IP address of the phone 21 so that its speech media, i.e., its data packets, will be monitored. The data manager 28 may also send the telephone's number or IP address and subscriber identifier to a word analyzer 24. This data may later be sent on to the CAS 10.

The speech media may be sent by the VoIP softswitch 6 and may be received by the media handler 22. In one embodiment, RTP data packets are used, which can be of any type, for example, codecs G.711, G.729, G.723, iLBC, GSM, JPEG, etc. The data packets may then be sent to the speech recognition engine 23. In one embodiment, the media handler 22 may first convert the codecs of the data packets, if they are not already supported by the speech recognition engine 23. In another embodiment, the data packets from multiple speech streams using different codecs may be converted to a common codec by the media handler 22.
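As one hedged illustration of conversion to a common codec, the sketch below expands G.711 mu-law payload bytes into 16-bit linear PCM. The choice of linear PCM as the common format, the codec label "PCMU", and the function names are assumptions of this sketch; the disclosure does not state which common codec the media handler 22 uses, and other codecs would need their own converters.

```python
import struct

ULAW_BIAS = 0x84  # bias constant used by G.711 mu-law companding


def ulaw_byte_to_pcm16(u):
    """Expand one G.711 mu-law byte into a signed 16-bit linear sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored bit-inverted
    exponent = (u & 0x70) >> 4         # 3-bit segment number
    mantissa = u & 0x0F                # 4-bit step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if (u & 0x80) else magnitude


def to_common_pcm(codec, payload):
    """Convert a payload to little-endian 16-bit PCM (the assumed common format)."""
    if codec == "PCMU":                # G.711 mu-law
        samples = [ulaw_byte_to_pcm16(b) for b in payload]
        return struct.pack(f"<{len(samples)}h", *samples)
    raise ValueError(f"no converter registered for codec {codec!r}")
```

A media handler built this way would reject streams whose codec it cannot convert, rather than passing unsupported data to the speech recognition engine.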

The speech recognition engine 23 is configured to recognize words and phrases within the speech media. It will be appreciated that speech recognition engine 23 may utilize any speech recognition algorithm, including, for example, any one of the well-known speech recognition engines currently on the market, such as the Nuance VoCon® family or NSC Speechblades™.

The words and phrases recognized by the speech recognition engine 23 may be sent to the word analyzer 24. For example, the speech recognition engine 23 may transmit a text file or electronic file including the particular words and phrases recognized in the speech media. Various filtering techniques may be employed to return only relevant data. In one embodiment, the speech recognition engine 23 may only send a particular word or phrase recognized in the speech media to the word analyzer 24 if it is located on a master list of key words and phrases. Thus, other non-essential jargon, words and phrases used in a typical telephone conversation can simply be ignored.
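The master-list filtering described above might be sketched as follows, assuming the recognized speech is available as plain text. The contents of MASTER_LIST and the function name filter_recognized are hypothetical; a real engine may filter on its internal word lattice rather than on a transcript.

```python
import re

# Hypothetical master list of key words and phrases.
MASTER_LIST = {"birthday", "party", "gift", "new television"}


def filter_recognized(transcript, master_list):
    """Return the master-list entries that occur in the transcript.

    Matching is case-insensitive and on whole words, so that "party"
    does not match inside "counterparty".
    """
    words = re.findall(r"[a-z0-9']+", transcript.lower())
    padded = " " + " ".join(words) + " "
    return [entry for entry in sorted(master_list) if f" {entry} " in padded]
```

For example, filter_recognized("Well, the birthday party is on Sunday, so bring a gift!", MASTER_LIST) returns ["birthday", "gift", "party"], and the remaining words of the sentence are ignored.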

The word analyzer and database query engine 24 performs a database query in a local database 25 and optionally in one or more remote databases 26. Remote database 26 may be maintained by a third party (e.g., advertiser, context provider, affiliate, etc.). The word analyzer and database query engine 24 is configured to generate an appropriately formatted query for the advertising databases 25, 26, for example, using a structured query language (SQL).
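A minimal sketch of such an SQL query is given below using Python's built-in sqlite3 module and an in-memory database. The table names (ads, ad_keywords), column names, and sample rows are assumptions of this sketch; the disclosure does not define a schema.

```python
import sqlite3


def build_ad_db():
    """Create a small in-memory advertising database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ads (ad_id TEXT PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE ad_keywords (ad_id TEXT, keyword TEXT)")
    conn.executemany(
        "INSERT INTO ads VALUES (?, ?)",
        [("toy-store", "Local toy store"),
         ("plasma-tv", "Plasma television"),
         ("bakery", "Bakery")],
    )
    conn.executemany(
        "INSERT INTO ad_keywords VALUES (?, ?)",
        [("toy-store", "birthday"), ("toy-store", "party"), ("toy-store", "gift"),
         ("plasma-tv", "buying"), ("plasma-tv", "new"), ("plasma-tv", "television"),
         ("bakery", "birthday")],
    )
    return conn


def query_ads(conn, keywords):
    """Return (ad_id, title, matches) rows, most keyword matches first."""
    if not keywords:
        return []
    # One "?" placeholder per keyword keeps the query parameterized.
    placeholders = ", ".join("?" for _ in keywords)
    sql = (
        "SELECT a.ad_id, a.title, COUNT(*) AS matches "
        "FROM ads a JOIN ad_keywords k ON a.ad_id = k.ad_id "
        f"WHERE k.keyword IN ({placeholders}) "
        "GROUP BY a.ad_id, a.title "
        "ORDER BY matches DESC, a.ad_id"
    )
    return conn.execute(sql, list(keywords)).fetchall()
```

Binding the recognized words as parameters, rather than splicing them into the SQL text, is a design choice of this sketch that avoids malformed queries when a recognized phrase contains quotation marks.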

The local database 25 and remote databases 26 may be, for example, relational databases populated with various sources of advertisements, including commercials, video, text, photos, graphics, sounds, music, etc. The database entries have been previously categorized according to their content.

Advertisers or marketers may pre-screen and index advertisements for particular demographics or other marketing metrics. Further, each advertisement may be associated or indexed with a particular set of words and phrases, which may later be matched with search results from the word analyzer and database query engine 24.

In other embodiments, the databases 25, 26 may be populated with other information, not necessarily related to commercial advertisements, for example, public service announcements (PSA), government warnings, etc.

Any database query algorithm and filtering may be used. In one example, query results are returned to the word analyzer and database query engine 24, which sorts the results according to relevancy scores given to each one, based on meta-data queried from the local database 25 and the remote databases 26. In another embodiment, the N most relevant advertisements are sent to the CAS 10, to be combined in the video frame data and displayed on the relevant subscriber's TV set 18. In some embodiments, the relevant advertisements may require formatting according to the CAS 10 requirements. CAS 10 may require a video stream only in high-definition (HD), for example.

The data manager 28 also sends the subscriber's identifier to the CAS 10, so the CAS will be able to associate the advertisements with a particular subscriber.

In one scenario of the invention, two persons might be having a conversation regarding a child's upcoming birthday party. The speech recognition engine 23 may recognize the words “4th birthday,” “party” and “gift” within the speech media from a master list of key words and phrases, and return them to the word analyzer and database query engine 24. In response to a query to local database 25, a television commercial for a local toy store may be retrieved and transmitted to CAS 10 for incorporation into the video feed.

In another scenario, two persons may be having a telephone conversation and one person mentions that she is interested in buying a new television set. Upon recognizing the words “buying,” “new” and “television,” a television commercial for a plasma television might be broadcast on her TV set 18.

In other embodiments, the speech recognition engine 23 may be configured to recognize the gender, language, or other traits of the persons speaking by analysis of their voice frequencies, tones, pitches, amplitude, patterns, etc. This may be further used to target commercials and other advertising. For example, if the speech recognition engine 23 recognizes that the persons are speaking in Spanish, advertisements and commercials may be broadcast in Spanish. In another embodiment, the speech recognition engine 23 may be able to identify different persons (users) within the subscriber's household, and select advertising accordingly.

In another embodiment, advertisers may provide different commercials for the same product/service, which are directed towards different demographics. For example, in the remote database 26 there may be two commercials for the same car. Commercial 1 may be directed towards an audience of men, and Commercial 2 may be directed towards an audience of women. Both commercials may be indexed in the database, e.g., under the same words and phrases such as “buying,” “new” and “car.” However, based on recognition that the persons speaking are likely women, Commercial 2 may be advantageously selected, rather than Commercial 1.
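The selection between Commercial 1 and Commercial 2 might be sketched as follows. The Commercial record, its target_gender field, and the fallback order (exact match, then an untargeted commercial, then the first candidate) are assumptions of this sketch and are not specified by the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commercial:
    """Hypothetical database entry for one commercial."""
    name: str
    product: str
    target_gender: Optional[str]  # None means the commercial is untargeted


def select_commercial(candidates, recognized_gender):
    """Pick the commercial whose target matches the recognized trait.

    candidates are commercials already matched on key words; the
    recognized_gender may be None when the engine could not decide.
    """
    if recognized_gender is not None:
        for commercial in candidates:
            if commercial.target_gender == recognized_gender:
                return commercial
    for commercial in candidates:
        if commercial.target_gender is None:
            return commercial
    return candidates[0] if candidates else None


CAR_COMMERCIALS = [
    Commercial("Commercial 1", "car", "male"),
    Commercial("Commercial 2", "car", "female"),
]
```

With these hypothetical entries, select_commercial(CAR_COMMERCIALS, "female") returns Commercial 2, while an undetermined trait falls back to the first candidate.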

Further targeting may be coupled with subscribers' profile information. When a subscriber signs up for the inventive advertising service, additional information may be collected, such as age, address, employment, income, marital status, etc. The service provider may provide a website, calling center, kiosks, or other means for signing up subscribers. New subscribers may be asked a number of questions for collecting marketing data. In addition, follow-ups from customer service representatives, surveys, emails, etc., may be used by the service providers to collect information. Other data mining techniques may retrieve further information regarding the subscriber, from third parties, affiliates, etc.

Using this additional information may further allow marketers and advertisers to better target subscribers. For example, in response to a person discussing an interest in buying a new car, commercials for luxury cars might be directed towards the subscriber, not only by recognizing words and phrases such as “buying,” “new” and “car,” but also based on knowledge from a survey that the subscriber is in a more affluent income range.

The above scenarios are merely representative and are not meant to be limiting. The various embodiments discussed may be combined to further target advertising, based on a number of factors.

FIG. 3 shows a flowchart for generating television advertisements based on a telephone conversation, in accordance with an embodiment of the present invention.

Beginning in step 31, a telephone call is initiated by a subscriber (user) using the phone 21 at the subscriber's premises 50 to a remote phone 12 via PSTN 8. In other embodiments, the call might be initiated from the remote phone 12 to the subscriber's phone 21.

In step 32, once the call is connected, a CDR is sent from the VoIP softswitch 6 to the billing system 7. The CDR may include the subscriber's identifier and the IP address of the telephone generating the call.

Next, in step 33, the billing system 7 recognizes the subscriber's identifier, and the television broadcasts associated with him. Continuing on to step 34, the billing system 7 checks whether the subscriber is associated with the inventive advertisement program.

In one embodiment, the voice recognition activity is performed only with the user's approval, given prior to any such call or system use. The present invention is not meant to harm any user's privacy, or to expose call information to third parties without the user's consent. In another embodiment, the system may monitor only the speech media of the subscriber's phone and not the speech media of non-subscribers during the telephone conversation. Also, the system may announce to the user (and even all call participants) that the telephone conversation is being monitored.

In other embodiments, subscribers may receive financial incentives for opting into the advertising program. For example, subscribers might receive telephone call credits, service fee reductions, or even money or cash equivalents, if they opt into the inventive advertising service. Moreover, subscribers may receive other advantages from retail merchants and/or ecommerce websites affiliated with the service provider.

If the subscriber has given authorization to participate in the advertisement program, the method proceeds to step 35; otherwise, the method may end.
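The authorization check of steps 32 through 35 might be sketched as follows. The CallDetailRecord fields mirror the subscriber's identifier and IP address named in the text; the OPTED_IN set, the handle_cdr function, and the start_monitoring callback are hypothetical stand-ins for the billing system's records and its interface to the speech recognition system 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallDetailRecord:
    """The two CDR fields the description relies on."""
    subscriber_id: str
    ip_address: str


# Hypothetical record of subscribers who approved speech recognition.
OPTED_IN = {"sub-001"}


def handle_cdr(cdr, opted_in, start_monitoring):
    """Steps 33-35: forward caller information only for opted-in subscribers.

    Returns True when monitoring was started, False when the method ends.
    """
    if cdr.subscriber_id not in opted_in:
        return False  # no approval on record, so nothing is monitored
    start_monitoring(cdr.subscriber_id, cdr.ip_address)
    return True
```

Placing the check before any call to start_monitoring reflects the stated requirement that recognition is performed only with prior approval.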

In step 35, the billing system 7 may send the subscriber's identifier and the subscriber's IP address to the speech recognition system 5. Next, in step 36, the speech recognition system 5 may monitor the voice data packet stream from the telephone conversation associated with the subscriber according to his IP address, if a VoIP phone or softphone is used. In other embodiments, the subscriber's telephone number may be used. The speech recognition system 5 uses a speech recognition algorithm that is configured to recognize key words and phrases.

Continuing to step 37, based on key words and phrases in the conversation, the word analyzer 24 finds relevant advertisements and/or other information associated with them. Relying on the subscriber's profile as well as the key words and phrases used within telephone conversations, advertisements may be selected for the subscriber from local database 25 or remote databases 26. In one embodiment, the word analyzer and database query engine 24 may return the N most relevant database entries.

In step 38, the speech recognition system 5 formats the advertisements to meet the requirements for a particular CAS 10. Different CASs 10 may be associated with different service providers, each having different requirements for format and content. In one embodiment, the advertisements may be configured to fill only a part (e.g., a header or footer) or the entire TV screen.

Next, in step 39, the speech recognition system 5 may send the advertisements to the CAS 10 along with the subscriber's identifier. In step 40, the CAS 10 may combine the advertisement with the video feed data, for example, from the VoD server 1, Playouts 2, and/or satellite dish 3, via modulators 4. The combined video feed may be sent to the subscriber's STB 19 and the TV set 18 via Fiber/Coax network 13. From the user's point of view, the advertisements appear to be seamlessly inserted within the television feed.

In one embodiment, once the user sees the advertisement on his TV set 18, he can get more information concerning the advertisement by clicking on it with a pointing device, e.g., a remote control, a mouse, stylus, or by using any other navigation means to view or obtain additional details concerning the advertisements. TV set 18 may be coupled via STB 19 to an Internet services provider. For example, the advertisement may comprise a hyperlink, e.g., a Uniform Resource Locator (URL), which the user can click on with his pointing device and be linked to an Internet website where he can get more information. In an embodiment, by selecting the advertiser, the user may be able to send an email to the advertiser. Similarly, the user may activate or cancel the advertisements displayed on his TV set during a phone call by using the TV remote control. In another embodiment, the user may activate or cancel the advertisements displayed on his TV by dialing specific codes on the phone keypad during the call. Those codes will be transmitted by the phone to the system using DTMF signals. The user may also call the service provider's interactive voice response (IVR) system and activate or deactivate the advertisement service, or change his profile settings. In other embodiments, the user may browse via the web to the service provider's portal, set his profile, and activate or deactivate the advertisement activity.
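The keypad control described above might be sketched as follows, assuming the DTMF tones have already been decoded into a string of keypad characters. The codes "*21" and "*22" are hypothetical; the disclosure does not say which codes activate or cancel the advertisements.

```python
ACTIVATE_CODE = "*21"    # hypothetical code to activate advertisements
DEACTIVATE_CODE = "*22"  # hypothetical code to cancel advertisements
CODE_LENGTH = 3


def apply_dtmf(digits, ads_enabled):
    """Scan decoded DTMF digits and return the resulting on/off state.

    Only the most recent CODE_LENGTH digits are kept, so a code is
    recognized wherever it appears in the digit stream.
    """
    window = ""
    for digit in digits:
        window = (window + digit)[-CODE_LENGTH:]
        if window == ACTIVATE_CODE:
            ads_enabled = True
            window = ""
        elif window == DEACTIVATE_CODE:
            ads_enabled = False
            window = ""
    return ads_enabled
```

Under these assumed codes, apply_dtmf("*22", True) returns False, and digits that do not form a code leave the current state unchanged.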

The billing system 7 may get indications from the CAS 10 concerning the status of the subscriber's TV set 18 via reverse path 14. For example, if the TV set 18 is turned “off,” the billing system 7 may prevent the CAS 10 from integrating advertisements into that subscriber's video feed and/or the system may not assign additional benefits to the subscriber in terms of service cost reductions, credits, etc.

In one embodiment, a sensor may be provided (e.g., in the STB 19 or stand-alone) which is configured to determine whether a wireless phone or mobile device is near the TV set 18. For example, if the wireless phone or mobile device used for the telephone call is away from the TV set 18, e.g., in another room or location altogether, it may not be necessary to integrate advertisements into that subscriber's video feed. In one embodiment, the sensor may track wireless signals to and from wireless phones or mobile devices. The sensor or STB 19 may then send an indication to the CAS 10, indicating that a wireless phone or cellular phone is active and is near the TV set 18. This practice allows the system not to present advertisements when the wireless phone or mobile device is not in proximity of the TV set 18.
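The two conditions just described, the TV set's status and the phone's proximity, might be combined as in the sketch below. The function name and its boolean inputs are assumptions of this sketch; how the CAS 10 and the sensor actually report these indications is not specified.

```python
def should_insert_ads(tv_on, uses_wireless_phone, phone_near_tv):
    """Decide whether advertisements are integrated into the video feed.

    tv_on: status reported for the TV set via the reverse path.
    uses_wireless_phone: True when the call is on a wireless or mobile device.
    phone_near_tv: sensor indication; ignored for fixed phones in this sketch.
    """
    if not tv_on:
        return False  # nobody can see the advertisement
    if uses_wireless_phone and not phone_near_tv:
        return False  # caller is in another room or location
    return True
```

In this sketch a call on a fixed phone is not subject to the proximity check, since the text describes the sensor only for wireless phones and mobile devices.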

The advertisements may be broadcast during the telephone conversation or some time thereafter. In other embodiments, advertisements concerning the telephone conversation may be displayed on the TV sets of a group of users of the same service provider. Similarly, advertisements associated with conversations for multiple users may be displayed on the TV sets of a group of subscribers of the same service provider. In another embodiment, advertisements associated with conversations for multiple users may be displayed on the TV sets of a group of subscribers of different service providers.

In other embodiments, advertisements associated with multiple users' conversations may be displayed on a specific TV set, using a relevant algorithm. For example, where two or more phones are associated with the same subscriber's premises 50, which has only one TV set 18, the system may prioritize the advertisements to be displayed on the TV based on the different profiles associated with the two phones, or based on a specific voice character recognized by the system on one of the phone lines.

While this invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that it is capable of further modifications and is not to be limited to the disclosed embodiment, and this application is intended to cover any variations, uses, equivalent arrangements or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains, and as may be applied to the essential features hereinbefore set forth and followed in the spirit and scope of the appended claims.

1.-29. (canceled)
30. A method for providing television advertisements based on a telephone call between two or more callers, the method comprising: monitoring a call between two or more callers in order to recognize key words and phrases spoken by one or more of the callers during the call; recognizing key words and phrases spoken by one or more of the callers during the call; querying a database having one or more advertisements indexed by words or phrases or both, the query based on key words or phrases or both recognized during the call; and during the call or after the call, displaying on a display device an advertisement from the database identified by the query.
31. The method of claim 30 wherein the display device receives a video feed from a conditional access system and wherein the display device is configured to receive and display Internet Protocol Television or video content or both.
32. The method of claim 30 further comprising: using a speech recognition engine to recognize one or more of: words or phrases or caller traits, during the call; sending the recognized words or phrases to a word analyzer for filtering the words; storing words or phrases or traits in association with a caller subscriber for use with a later advertisement transmission.
33. The method of claim 30 wherein the call involves three or more callers.
34. The method of claim 30 wherein the call uses Voice over Internet Protocol and is made using an IP phone, or a soft phone, or an IP-phone client application or an Analog Terminal Adaptor.
35. The method of claim 30 further comprising: recognizing one or more traits of a caller, the traits comprising one or more of the following: gender, language, voice frequency, tone, pitch, and amplitude.
36. The method of claim 30 wherein the advertisement is displayed after the call and the call is transported over a packet network or a plain old telephone service.
37. The method of claim 30 wherein the display device is receiving a video signal from a Conditional Access System, the conditional access system providing the display device the video signal from one or more of the following: fiber, coaxial cable, satellite network, a cellular network, a wireless network, or a fixed line.
38. The method of claim 30 wherein the call is initiated by a subscriber via Voice over Internet Protocol.
39. The method of claim 30 wherein a caller is a subscriber and further comprising: selecting an advertisement to display for a subscriber based upon a profile of the subscriber and selecting an advertisement to display based upon words, phrases and recognized traits previously used by the subscriber during a call.
40. The method of claim 30 wherein a caller is a subscriber and further comprising: relying on a subscriber's profile and key words and phrases used within several calls, to select an advertisement to display for the subscriber.
41. The method of claim 30 wherein a caller is a subscriber and further comprising: maintaining profile settings of a subscriber, the profile settings used for providing advertisements to the subscriber when the subscriber is on a call or after the call, the profile settings changeable by the subscriber or by a service provider.
42. The method of claim 30 wherein the advertisement is displayed after the call and the call is made using a cellular service or a wireless device or an analog phone.
43. The method of claim 30 wherein a caller is a subscriber and further comprising: maintaining profile settings of a subscriber, the profile settings used for providing advertisements to the subscriber.
44. The method of claim 30 wherein the advertisement is displayed on a caller's mobile device.
45. The method of claim 30 wherein a caller may get more information concerning the advertisement by clicking on the advertisement or a hyperlink therein, with a pointing device, remote control, mouse, stylus or any other navigation means and view more information or web contents sent via the Internet.
46. The method of claim 30 wherein the database holding the advertisement contents holds other information comprising public service announcements (PSA) or government warnings.
47. The method of claim 45 wherein when the caller gets more information by selecting the advertisement the caller is able to send an email to the advertiser.
48. A method of providing advertisements comprising: recognizing key words during a phone call; finding relevant advertisements associated with the recognized key words; and after recognizing the key words during the phone call, sending a combined advertisement and video data to a subscriber's video display.